The Indifferent Naive Bayes Classifier

Authors

  • Jesús Cerquides
  • Ramon López de Mántaras
Abstract

The Naive Bayes classifier is a simple and accurate classifier. This paper shows that, by assuming the Naive Bayes classifier model and applying Bayesian model averaging and the principle of indifference, an equally simple, more accurate and theoretically well-founded classifier can be obtained.

Copyright © 2003, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

In this paper we use Bayesian model averaging and the principle of indifference to derive an improved classifier, which we name the Indifferent Naive Bayes classifier (IndifferentNB from now on). First, we introduce the Naive Bayes model, paying special attention to its conditional independence assumptions and to the estimation of its parameters. Second, we introduce Naive distributions and show that they are conjugate with respect to the Naive Bayes model and that they can be integrated in closed form to get averaged predictions. Third, we apply the principle of indifference, obtaining the final expression for IndifferentNB. Fourth, we perform an empirical comparison of IndifferentNB with the standard implementation of Naive Bayes and with the one proposed in (Kontkanen et al. 1998), showing that IndifferentNB reduces the classification error rate and approximates the probabilities better, especially when little data is available. We finish with some conclusions and possibilities for future research.

The Naive Bayes model

The Naive Bayes classifier (Langley, Iba, & Thompson 1992) is a classification method based on the assumption of conditional independence between the different variables in the dataset given the class. Following the notation in (Cowell et al. 1999), for random variables $X$, $Y$ and $Z$ we write $X \perp\!\!\!\perp Y \mid Z$ for "X is conditionally independent of Y given Z". In this notation, the Naive Bayes model states that

$$\forall i, j,\; 1 \le i, j \le n:\quad A_i \perp\!\!\!\perp A_j \mid C \tag{1}$$

The Naive Bayes model as a Bayesian network

As can be seen in (Cowell et al. 1999) and in (Friedman, Geiger, & Goldszmidt 1997), in terms of Bayesian networks the Naive Bayes model can be represented as the network in Figure 1. The Bayesian network in Figure 1 is not the
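
The conditional independence assumption of the Naive Bayes model described above means that the class posterior is proportional to the class prior multiplied by one per-attribute term for each attribute. The following Python block is a minimal sketch of that factorization for categorical attributes. It is not the IndifferentNB classifier and not the authors' implementation: the function names `train_naive_bayes` and `predict_proba`, the toy data, and the additive smoothing constant `alpha` are all assumptions introduced here for illustration.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Collect the counts a categorical Naive Bayes model needs.

    `alpha` is a hypothetical additive smoothing constant chosen for this
    sketch; it is not the estimator derived in the paper.
    """
    class_counts = Counter(labels)
    # value_counts[(i, c)][v]: rows of class c whose attribute i equals v
    value_counts = defaultdict(Counter)
    # values[i]: set of distinct values seen for attribute i
    values = defaultdict(set)
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            values[i].add(v)
    return class_counts, value_counts, values, alpha


def predict_proba(model, row):
    """Return normalized class scores: prior times a product over attributes."""
    class_counts, value_counts, values, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        # Smoothed class prior
        p = (n_c + alpha) / (total + alpha * len(class_counts))
        # Conditional independence given the class: one factor per attribute
        for i, v in enumerate(row):
            p *= (value_counts[(i, c)][v] + alpha) / (n_c + alpha * len(values[i]))
        scores[c] = p
    z = sum(scores.values())
    return {c: p / z for c, p in scores.items()}


# Toy data, invented for this example only
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]

model = train_naive_bayes(rows, labels)
probs = predict_proba(model, ("rainy", "cool"))
print(max(probs, key=probs.get), probs)
```

The sketch stores only counts, so training is a single pass over the data and prediction multiplies one factor per attribute. It assumes every queried attribute value was seen during training; unseen values would need extra handling.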


Similar resources

A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the fast increase of the documents, using Text Document Classification (TDC) methods has become a crucial matter. This paper presented a hybrid model of Invasive Weed Optimization (IWO) and Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS) in order to reduce the big size of features space in TDC. TDC includes different actions such as text processing, feature extraction, form...

Naive Credal Classifier 2: a robust approach to classification for small and incomplete data sets

Naive Credal Classifier, which is an imprecise-probability counterpart of Naive Bayes, is rigorously extended to a very general and flexible treatment of incomplete data, yielding a new classifier called Naive Credal Classifier 2 (NCC2). The new classifier delivers classifications that are robust to the presence of small sample sizes and missing values. In particular, some empirical evaluations...

Incremental Weighted Naive Bays Classifiers for Data Stream

A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes’ theorem with naive independence assumption. The explanatory variables (Xi) are assumed to be independent from the target variable (Y ). Despite this strong assumption this classifier has proved to be very effective on many real applications and is often used on data stream for supervised classification. The n...

Privacy Preserving Naive Bayes Classifier for Horizontally Partitioned Data

The problem of secure distributed classification is an important one. In many situations, data is split between multiple organizations. These organizations may want to utilize all of the data to create more accurate predictive models while revealing neither their training data / databases nor the instances to be classified. The Naive Bayes Classifier is a simple but efficient baseline classifie...

Publication date: 2003